Overview

Dataset statistics

Number of variables: 84
Number of observations: 53214
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 17.4 MiB
Average record size in memory: 343.0 B

Variable types

BOOL: 70
NUM: 12
CAT: 2

Warnings

number_emergency is highly skewed (γ1 = 27.22568857)
num_procedures has 22880 (43.0%) zeros
number_outpatient has 46619 (87.6%) zeros
number_emergency has 49782 (93.6%) zeros
number_inpatient has 47197 (88.7%) zeros
numchange has 40766 (76.6%) zeros
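The zero shares in these warnings follow directly from the counts and the 53214 observations, and γ1 is a sample skewness. The sketch below recomputes the percentages and shows one common bias-adjusted skewness estimator (the adjusted Fisher-Pearson coefficient). Whether the report used exactly this estimator is an assumption, and the helper names `zero_share` and `sample_skewness` are illustrative, not part of the report.

```python
import math

N_ROWS = 53214  # number of observations from the overview

# Zero counts copied from the warnings above.
ZERO_COUNTS = {
    "num_procedures": 22880,
    "number_outpatient": 46619,
    "number_emergency": 49782,
    "number_inpatient": 47197,
    "numchange": 40766,
}


def zero_share(n_zeros, n_rows):
    """Fraction of observations that are exactly zero."""
    return n_zeros / n_rows


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (assumed estimator, see lead-in)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


for name, zeros in ZERO_COUNTS.items():
    print(f"{name}: {zero_share(zeros, N_ROWS):.1%} zeros")

# Toy data: mostly zeros with one large value gives a strong right skew.
print(sample_skewness([0] * 9 + [10]))
```

A column dominated by zeros with a few large values, as number_emergency is, produces a large positive skewness under this estimator.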

Reproduction

Analysis started: 2020-09-15 20:20:50.864232
Analysis finished: 2020-09-15 20:21:30.080002
Duration: 39.22 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

admission_type_id
Real number (ℝ≥0)

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.128669147
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Memory size: 415.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 3
95-th percentile: 6
Maximum: 8
Range: 7
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.52929783
Coefficient of variation (CV): 0.7184290861
Kurtosis: 1.680947059
Mean: 2.128669147
Median Absolute Deviation (MAD): 1
Skewness: 1.526149567
Sum: 113275
Variance: 2.338751853
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=8)

Common values

Value  Count  Frequency (%)
1      26591  50.0%
3      10837  20.4%
2      9784   18.4%
6      3734   7.0%
5      1965   3.7%
8      279    0.5%
7      16     < 0.1%
4      8      < 0.1%

Minimum 5 values

Value  Count  Frequency (%)
1      26591  50.0%
2      9784   18.4%
3      10837  20.4%
4      8      < 0.1%
5      1965   3.7%

Maximum 5 values

Value  Count  Frequency (%)
8      279    0.5%
7      16     < 0.1%
6      3734   7.0%
5      1965   3.7%
4      8      < 0.1%
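
Because admission_type_id has only eight distinct values, its descriptive statistics can be re-derived from the frequency table alone. The following is a minimal sketch of that check, assuming the sample (n - 1) variance, which is the convention that reproduces the reported figure; the variable names are illustrative.

```python
import math
import statistics

# Value -> count, copied from the admission_type_id frequency table.
counts = {1: 26591, 2: 9784, 3: 10837, 4: 8, 5: 1965, 6: 3734, 7: 16, 8: 279}

n = sum(counts.values())                          # number of observations
total = sum(v * c for v, c in counts.items())     # "Sum" in the report
mean = total / n

# Sample variance with an n - 1 denominator (assumed convention).
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                   # coefficient of variation

# Expand the counts into a sorted column for the order statistics.
data = sorted(v for v, c in counts.items() for _ in range(c))
median = statistics.median(data)
mad = statistics.median(abs(v - median) for v in data)  # median absolute deviation

print(n, total, round(mean, 9), round(variance, 9), round(cv, 9), median, mad)
```

The coefficient of variation is simply the standard deviation divided by the mean, and the MAD is the median of the absolute deviations from the median.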
 

discharge_disposition_id
Real number (ℝ≥0)

Distinct: 21
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.233096554
Minimum: 1
Maximum: 28
Zeros: 0
Zeros (%): 0.0%
Memory size: 415.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 3
95-th percentile: 18
Maximum: 28
Range: 27
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 4.948578289
Coefficient of variation (CV): 1.530600218
Kurtosis: 8.46417106
Mean: 3.233096554
Median Absolute Deviation (MAD): 0
Skewness: 2.991388402
Sum: 172046
Variance: 24.48842708
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=21)

Common values

Value              Count  Frequency (%)
1                  34705  65.2%
3                  6312   11.9%
6                  6076   11.4%
18                 1781   3.3%
2                  1157   2.2%
22                 943    1.8%
5                  678    1.3%
25                 482    0.9%
4                  395    0.7%
7                  313    0.6%
Other values (11)  372    0.7%

Minimum 5 values

Value  Count  Frequency (%)
1      34705  65.2%
2      1157   2.2%
3      6312   11.9%
4      395    0.7%
5      678    1.3%

Maximum 5 values

Value  Count  Frequency (%)
28     48     0.1%
27     3      < 0.1%
25     482    0.9%
24     21     < 0.1%
23     203    0.4%
 

admission_source_id
Real number (ℝ≥0)

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.601740144
Minimum: 1
Maximum: 25
Zeros: 0
Zeros (%): 0.0%
Memory size: 415.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 7
Q3: 7
95-th percentile: 17
Maximum: 25
Range: 24
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.200177537
Coefficient of variation (CV): 0.7497987107
Kurtosis: 1.63122976
Mean: 5.601740144
Median Absolute Deviation (MAD): 0
Skewness: 1.097477822
Sum: 298091
Variance: 17.64149134
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=17)

Common values

Value             Count  Frequency (%)
7                 27584  51.8%
1                 16758  31.5%
17                3721   7.0%
4                 2173   4.1%
6                 1472   2.8%
2                 782    1.5%
5                 402    0.8%
20                130    0.2%
3                 87     0.2%
9                 81     0.2%
Other values (7)  24     < 0.1%

Minimum 5 values

Value  Count  Frequency (%)
1      16758  31.5%
2      782    1.5%
3      87     0.2%
4      2173   4.1%
5      402    0.8%

Maximum 5 values

Value  Count  Frequency (%)
25     2      < 0.1%
22     3      < 0.1%
20     130    0.2%
17     3721   7.0%
14     2      < 0.1%
 

time_in_hospital
Real number (ℝ≥0)

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.196188973
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Memory size: 415.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 3
Q3: 6
95-th percentile: 10
Maximum: 14
Range: 13
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.91533184
Coefficient of variation (CV): 0.6947570424
Kurtosis: 1.125880433
Mean: 4.196188973
Median Absolute Deviation (MAD): 2
Skewness: 1.21542345
Sum: 223296
Variance: 8.499159738
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=14)

Common values

Value             Count  Frequency (%)
3                 9568   18.0%
2                 9451   17.8%
1                 8390   15.8%
4                 7084   13.3%
5                 5001   9.4%
6                 3747   7.0%
7                 2848   5.4%
8                 2066   3.9%
9                 1376   2.6%
10                1093   2.1%
Other values (4)  2590   4.9%

Minimum 5 values

Value  Count  Frequency (%)
1      8390   15.8%
2      9451   17.8%
3      9568   18.0%
4      7084   13.3%
5      5001   9.4%

Maximum 5 values

Value  Count  Frequency (%)
14     469    0.9%
13     574    1.1%
12     654    1.2%
11     893    1.7%
10     1093   2.1%
 

num_lab_procedures
Real number (ℝ≥0)

Distinct: 115
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.12943962
Minimum: 1
Maximum: 121
Zeros: 0
Zeros (%): 0.0%
Memory size: 415.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 31
median: 44
Q3: 57
95-th percentile: 74
Maximum: 121
Range: 120
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 19.93832051
Coefficient of variation (CV): 0.4622902752
Kurtosis: -0.2967895251
Mean: 43.12943962
Median Absolute Deviation (MAD): 13
Skewness: -0.2196322131
Sum: 2295090
Variance: 397.5366249
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1                   1662   3.1%
43                  1359   2.6%
44                  1250   2.3%
45                  1187   2.2%
38                  1170   2.2%
47                  1148   2.2%
46                  1146   2.2%
40                  1130   2.1%
37                  1087   2.0%
41                  1073   2.0%
Other values (105)  41002  77.1%

Minimum 5 values

Value  Count  Frequency (%)
1      1662   3.1%
2      568    1.1%
3      383    0.7%
4      234    0.4%
5      164    0.3%

Maximum 5 values

Value  Count  Frequency (%)
121    1      < 0.1%
120    1      < 0.1%
118    1      < 0.1%
114    1      < 0.1%
113    2      < 0.1%
 

num_procedures
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.460386364
Minimum: 0
Maximum: 6
Zeros: 22880
Zeros (%): 43.0%
Memory size: 415.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 6
Maximum: 6
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.768805094
Coefficient of variation (CV): 1.211189817
Kurtosis: 0.4700768716
Mean: 1.460386364
Median Absolute Deviation (MAD): 1
Skewness: 1.1943098
Sum: 77713
Variance: 3.128671461
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Common values

Value  Count  Frequency (%)
0      22880  43.0%
1      10697  20.1%
2      6971   13.1%
3      5503   10.3%
6      3022   5.7%
4      2272   4.3%
5      1869   3.5%

Minimum 5 values

Value  Count  Frequency (%)
0      22880  43.0%
1      10697  20.1%
2      6971   13.1%
3      5503   10.3%
4      2272   4.3%

Maximum 5 values

Value  Count  Frequency (%)
6      3022   5.7%
5      1869   3.5%
4      2272   4.3%
3      5503   10.3%
2      6971   13.1%
 

num_medications
Real number (ℝ≥0)

Distinct: 73
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.5756192
Minimum: 1
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Memory size: 415.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 10
median: 14
Q3: 20
95-th percentile: 31
Maximum: 81
Range: 80
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 8.390502151
Coefficient of variation (CV): 0.5386946127
Kurtosis: 3.767344485
Mean: 15.5756192
Median Absolute Deviation (MAD): 5
Skewness: 1.445086226
Sum: 828841
Variance: 70.40052634
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
12                 3200   6.0%
13                 3156   5.9%
11                 3040   5.7%
10                 2990   5.6%
14                 2953   5.5%
15                 2930   5.5%
9                  2777   5.2%
16                 2710   5.1%
8                  2516   4.7%
17                 2450   4.6%
Other values (63)  24492  46.0%

Minimum 5 values

Value  Count  Frequency (%)
1      181    0.3%
2      325    0.6%
3      591    1.1%
4      923    1.7%
5      1288   2.4%

Maximum 5 values

Value  Count  Frequency (%)
81     1      < 0.1%
79     1      < 0.1%
75     2      < 0.1%
74     1      < 0.1%
69     2      < 0.1%
 

number_outpatient
Real number (ℝ≥0)

ZEROS

Distinct: 29
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2658322998
Minimum: 0
Maximum: 36
Zeros: 46619
Zeros (%): 87.6%
Memory size: 415.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 36
Range: 36
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.019417003
Coefficient of variation (CV): 3.834812412
Kurtosis: 163.6789768
Mean: 0.2658322998
Median Absolute Deviation (MAD): 0
Skewness: 9.167543438
Sum: 14146
Variance: 1.039211025
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=29)

Common values

Value              Count  Frequency (%)
0                  46619  87.6%
1                  3422   6.4%
2                  1430   2.7%
3                  820    1.5%
4                  427    0.8%
5                  203    0.4%
6                  92     0.2%
7                  46     0.1%
8                  45     0.1%
9                  27     0.1%
Other values (19)  83     0.2%

Minimum 5 values

Value  Count  Frequency (%)
0      46619  87.6%
1      3422   6.4%
2      1430   2.7%
3      820    1.5%
4      427    0.8%

Maximum 5 values

Value  Count  Frequency (%)
36     1      < 0.1%
35     1      < 0.1%
33     1      < 0.1%
29     1      < 0.1%
26     1      < 0.1%
 

number_emergency
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.09061525163
Minimum: 0
Maximum: 42
Zeros: 49782
Zeros (%): 93.6%
Memory size: 415.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 42
Range: 42
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4967912989
Coefficient of variation (CV): 5.482424757
Kurtosis: 1741.902071
Mean: 0.09061525163
Median Absolute Deviation (MAD): 0
Skewness: 27.22568857
Sum: 4822
Variance: 0.2468015947
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=16)

Common values

Value             Count  Frequency (%)
0                 49782  93.6%
1                 2671   5.0%
2                 495    0.9%
3                 154    0.3%
4                 49     0.1%
5                 22     < 0.1%
6                 18     < 0.1%
8                 6      < 0.1%
7                 5      < 0.1%
10                4      < 0.1%
Other values (6)  8      < 0.1%

Minimum 5 values

Value  Count  Frequency (%)
0      49782  93.6%
1      2671   5.0%
2      495    0.9%
3      154    0.3%
4      49     0.1%

Maximum 5 values

Value  Count  Frequency (%)
42     1      < 0.1%
37     1      < 0.1%
25     1      < 0.1%
20     1      < 0.1%
11     1      < 0.1%
 

number_inpatient
Real number (ℝ≥0)

ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1696546022
Minimum: 0
Maximum: 12
Zeros: 47197
Zeros (%): 88.7%
Memory size: 415.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 12
Range: 12
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.5882281936
Coefficient of variation (CV): 3.467210356
Kurtosis: 50.67093536
Mean: 0.1696546022
Median Absolute Deviation (MAD): 0
Skewness: 5.711489778
Sum: 9028
Variance: 0.3460124077
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=13)

Common values

Value             Count  Frequency (%)
0                 47197  88.7%
1                 4248   8.0%
2                 1115   2.1%
3                 353    0.7%
4                 167    0.3%
5                 62     0.1%
6                 39     0.1%
7                 11     < 0.1%
8                 10     < 0.1%
10                4      < 0.1%
Other values (3)  8      < 0.1%

Minimum 5 values

Value  Count  Frequency (%)
0      47197  88.7%
1      4248   8.0%
2      1115   2.1%
3      353    0.7%
4      167    0.3%

Maximum 5 values

Value  Count  Frequency (%)
12     2      < 0.1%
11     2      < 0.1%
10     4      < 0.1%
9      4      < 0.1%
8      10     < 0.1%
 

number_diagnoses
Real number (ℝ≥0)

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.15356861
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Memory size: 415.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 5
median: 8
Q3: 9
95-th percentile: 9
Maximum: 16
Range: 15
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.030155451
Coefficient of variation (CV): 0.2837961808
Kurtosis: -0.4392293637
Mean: 7.15356861
Median Absolute Deviation (MAD): 1
Skewness: -0.6731462681
Sum: 380670
Variance: 4.121531154
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=16)

Common values

Value             Count  Frequency (%)
9                 22661  42.6%
5                 7068   13.3%
6                 5895   11.1%
7                 5652   10.6%
8                 5515   10.4%
4                 3542   6.7%
3                 1920   3.6%
2                 734    1.4%
1                 174    0.3%
16                26     < 0.1%
Other values (6)  27     0.1%

Minimum 5 values

Value  Count  Frequency (%)
1      174    0.3%
2      734    1.4%
3      1920   3.6%
4      3542   6.7%
5      7068   13.3%

Maximum 5 values

Value  Count  Frequency (%)
16     26     < 0.1%
15     5      < 0.1%
14     3      < 0.1%
13     8      < 0.1%
12     3      < 0.1%
 

max_glu_serum
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
-99    50851  95.6%
0      1231   2.3%
1      1132   2.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 2.911188785
Min length: 1

A1Cresult
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
-99    43216  81.2%
1      7024   13.2%
0      2974   5.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 2.624234224
Min length: 1

metformin
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      41655  78.3%
1      11559  21.7%
 
repaglinide
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      52573  98.8%
1      641    1.2%
 
nateglinide
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      52844  99.3%
1      370    0.7%
 
chlorpropamide
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      53157  99.9%
1      57     0.1%
 
glimepiride
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      50381  94.7%
1      2833   5.3%
 
acetohexamide
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      53213  > 99.9%
1      1      < 0.1%
 

glipizide
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      46508  87.4%
1      6706   12.6%
 

glyburide
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      47399  89.1%
1      5815   10.9%
 
tolbutamide
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      53199  > 99.9%
1      15     < 0.1%
 
pioglitazone
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      49213  92.5%
1      4001   7.5%
 
rosiglitazone
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      49763  93.5%
1      3451   6.5%
 

acarbose
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      53075  99.7%
1      139    0.3%
 

miglitol
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      53204  > 99.9%
1      10     < 0.1%
 
troglitazone
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      53212  > 99.9%
1      2      < 0.1%
 

tolazamide
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      53190  > 99.9%
1      24     < 0.1%
 

insulin
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
1      26712  50.2%
0      26502  49.8%
 
glyburide-metformin
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      52853  99.3%
1      361    0.7%
 
glipizide-metformin
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      53211  > 99.9%
1      3      < 0.1%
 
metformin-rosiglitazone
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      53212  > 99.9%
1      2      < 0.1%
 
metformin-pioglitazone
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      53213  > 99.9%
1      1      < 0.1%
 

change
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      29787  56.0%
1      23427  44.0%
 
diabetesMed
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
1      39995  75.2%
0      13219  24.8%
 

readmitted
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 415.7 KiB

Value  Count  Frequency (%)
0      51061  96.0%
1      2153   4.0%
 

numchange
Real number (ℝ≥0)

ZEROS

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.248787913
Minimum: 0
Maximum: 4
Zeros: 40766
Zeros (%): 76.6%
Memory size: 415.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 4
Range: 4
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4681279296
Coefficient of variation (CV): 1.881634538
Kurtosis: 2.50601288
Mean: 0.248787913
Median Absolute Deviation (MAD): 0
Skewness: 1.700747842
Sum: 13239
Variance: 0.2191437585
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=5)

Common values

Value  Count  Frequency (%)
0      40766  76.6%
1      11721  22.0%
2      666    1.3%
3      58     0.1%
4      3      < 0.1%

Minimum 5 values

Value  Count  Frequency (%)
0      40766  76.6%
1      11721  22.0%
2      666    1.3%
3      58     0.1%
4      3      < 0.1%

Maximum 5 values

Value  Count  Frequency (%)
4      3      < 0.1%
3      58     0.1%
2      666    1.3%
1      11721  22.0%
0      40766  76.6%
 

race_Asian
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52797  99.2%
1      417    0.8%
 
race_Caucasian
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
1      41354  77.7%
0      11860  22.3%
 
race_Hispanic
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52010  97.7%
1      1204   2.3%
 

race_Other
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52268  98.2%
1      946    1.8%
 
age_[10-20)
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52772  99.2%
1      442    0.8%
 
age_[20-30)
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52321  98.3%
1      893    1.7%
 
age_[30-40)
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      51079  96.0%
1      2135   4.0%
 
age_[40-50)
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      47895  90.0%
1      5319   10.0%
 
age_[50-60)
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      43583  81.9%
1      9631   18.1%
 
age_[60-70)
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      41177  77.4%
1      12037  22.6%
 
age_[70-80)
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      40058  75.3%
1      13156  24.7%
 
age_[80-90)
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      45121  84.8%
1      8093   15.2%
 
age_[90-100)
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      51851  97.4%
1      1363   2.6%
 
medical_specialty_Emergency/Trauma
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      50329  94.6%
1      2885   5.4%
 
medical_specialty_Family/GeneralPractice
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      49553  93.1%
1      3661   6.9%
 
medical_specialty_Gastroenterology
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52937  99.5%
1      277    0.5%
 
medical_specialty_InternalMedicine
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      44679  84.0%
1      8535   16.0%
 
medical_specialty_Nephrology
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52731  99.1%
1      483    0.9%
 
medical_specialty_ObstetricsandGynecology
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52682  99.0%
1      532    1.0%
 
medical_specialty_Orthopedics
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52341  98.4%
1      873    1.6%
 
medical_specialty_Orthopedics-Reconstructive
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52300  98.3%
1      914    1.7%
 
medical_specialty_Other
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      51522  96.8%
1      1692   3.2%
 
medical_specialty_Psychiatry
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52742  99.1%
1      472    0.9%
 
medical_specialty_Pulmonology
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52705  99.0%
1      509    1.0%
 
medical_specialty_Radiologist
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52603  98.9%
1      611    1.1%
 
medical_specialty_Surgery-Cardiovascular/Thoracic
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52822  99.3%
1      392    0.7%
 
medical_specialty_Surgery-General
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      51455  96.7%
1      1759   3.3%
 
medical_specialty_Surgery-Neuro
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52855  99.3%
1      359    0.7%
 
medical_specialty_Surgery-Vascular
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52949  99.5%
1      265    0.5%
 
medical_specialty_Unknow
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      28043  52.7%
1      25171  47.3%
 
medical_specialty_Urology
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52796  99.2%
1      418    0.8%
 
diag_1_Diabetes
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      48994  92.1%
1      4220   7.9%
 
diag_1_Digestive
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      48254  90.7%
1      4960   9.3%
 
diag_1_Genitourinary
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      50570  95.0%
1      2644   5.0%
 
diag_1_Injury
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      49589  93.2%
1      3625   6.8%
 
diag_1_Muscoloskeletal
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      49891  93.8%
1      3323   6.2%
 
diag_1_Neoplasms
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      44608  83.8%
1      8606   16.2%
 
diag_1_Others
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      50512  94.9%
1      2702   5.1%
 
diag_1_Respiratory
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      46170  86.8%
1      7044   13.2%
 
diag_2_Diabetes
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      45537  85.6%
1      7677   14.4%
 
diag_2_Digestive
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      51011  95.9%
1      2203   4.1%
 
diag_2_Genitourinary
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      49189  92.4%
1      4025   7.6%
 
diag_2_Injury
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      51707  97.2%
1      1507   2.8%
 
diag_2_Muscoloskeletal
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      52181  98.1%
1      1033   1.9%
 
diag_2_Neoplasms
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      42570  80.0%
1      10644  20.0%
 
diag_2_Others
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      48938  92.0%
1      4276   8.0%
 
diag_2_Respiratory
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value  Count  Frequency (%)
0      48045  90.3%
1      5169   9.7%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
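
As a minimal sketch of this definition, the function below divides the covariance by the product of the standard deviations; the common 1/(n - 1) factors cancel, so they are omitted. The name `pearson_r` and the toy data are illustrative assumptions, not taken from the report.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centred products; the 1/(n - 1) factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(pearson_r(xs, ys))
# Shifting and rescaling xs leaves r unchanged (location and scale invariance).
print(pearson_r([10 * x + 3 for x in xs], ys))
```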

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
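
The rank-based recipe can be sketched as follows: replace each value by its rank, then apply the covariance-over-standard-deviations formula to the ranks. Giving tied values their average rank is a common convention and is assumed here; `average_ranks` and `spearman_rho` are illustrative names.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    ss_x = sum((a - mx) ** 2 for a in rx)
    ss_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(ss_x * ss_y)


# A monotonic but nonlinear relationship still gives a perfect rank correlation.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys))
```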

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
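
The pair-counting description corresponds to the variant usually called τ-a. The sketch below implements it literally with a quadratic loop over all pairs; tied pairs count as neither concordant nor discordant. Statistical libraries often apply a tie correction (τ-b) instead, so results can differ on data with many ties. The name `kendall_tau_a` is an illustrative choice.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)


print(kendall_tau_a([1, 2, 3], [1, 3, 2]))  # 2 concordant, 1 discordant, 3 pairs
```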

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
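
As a hedged sketch of the bias correction, the function below computes the chi-squared statistic from a contingency table, shrinks φ² and the table dimensions by their small-sample terms, and takes the square root of the corrected ratio. It follows the commonly cited form of Bergsma's correction; whether the report's implementation matches it in every detail is an assumption, and `cramers_v_corrected` is an illustrative name.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table: no association can be measured
    return math.sqrt(phi2_corr / denom)


print(cramers_v_corrected([[50, 0], [0, 50]]))    # perfectly associated
print(cramers_v_corrected([[25, 25], [25, 25]]))  # independent
```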

Missing values

Sample

First rows

admission_type_iddischarge_disposition_idadmission_source_idtime_in_hospitalnum_lab_proceduresnum_proceduresnum_medicationsnumber_outpatientnumber_emergencynumber_inpatientnumber_diagnosesmax_glu_serumA1Cresultmetforminrepaglinidenateglinidechlorpropamideglimepirideacetohexamideglipizideglyburidetolbutamidepioglitazonerosiglitazoneacarbosemiglitoltroglitazonetolazamideinsulinglyburide-metforminglipizide-metforminmetformin-rosiglitazonemetformin-pioglitazonechangediabetesMedreadmittednumchangerace_Asianrace_Caucasianrace_Hispanicrace_Otherage_[10-20)age_[20-30)age_[30-40)age_[40-50)age_[50-60)age_[60-70)age_[70-80)age_[80-90)age_[90-100)medical_specialty_Emergency/Traumamedical_specialty_Family/GeneralPracticemedical_specialty_Gastroenterologymedical_specialty_InternalMedicinemedical_specialty_Nephrologymedical_specialty_ObstetricsandGynecologymedical_specialty_Orthopedicsmedical_specialty_Orthopedics-Reconstructivemedical_specialty_Othermedical_specialty_Psychiatrymedical_specialty_Pulmonologymedical_specialty_Radiologistmedical_specialty_Surgery-Cardiovascular/Thoracicmedical_specialty_Surgery-Generalmedical_specialty_Surgery-Neuromedical_specialty_Surgery-Vascularmedical_specialty_Unknowmedical_specialty_Urologydiag_1_Diabetesdiag_1_Digestivediag_1_Genitourinarydiag_1_Injurydiag_1_Muscoloskeletaldiag_1_Neoplasmsdiag_1_Othersdiag_1_Respiratorydiag_2_Diabetesdiag_2_Digestivediag_2_Genitourinarydiag_2_Injurydiag_2_Muscoloskeletaldiag_2_Neoplasmsdiag_2_Othersdiag_2_Respiratory
06251141010001-99-9900000000000000000000000001000000000000000000010000000001000000010000000
11173590180009-99-9900000000000000010000110101001000000000000000000000000100000010010000000
21172115132016-99-9900000010000000000000010000000100000000000000000000000100000001010000000
31172441160007-99-9900000000000000010000110101000010000000000000000000000100000010010000000
4117151080005-99-9900000010000000010000110001000001000000000000000000000100000010000000100
52123316160009-99-9900000000000000010000010001000000100000000000000000000100000000000000000
63124701210007-99-9910001000000000010000110001000000010000000000000000000100000000000000000
71175730120008-99-9900000001000000000000010001000000001000000000000000000100000000000000001
821413682280008-99-9900000010000000010000110001000000000100000000000000000100000000000000000
933412333180008-99-9900000000001000010000110001000000000010001000000000000000000000000000100

Last rows

admission_type_iddischarge_disposition_idadmission_source_idtime_in_hospitalnum_lab_proceduresnum_proceduresnum_medicationsnumber_outpatientnumber_emergencynumber_inpatientnumber_diagnosesmax_glu_serumA1Cresultmetforminrepaglinidenateglinidechlorpropamideglimepirideacetohexamideglipizideglyburidetolbutamidepioglitazonerosiglitazoneacarbosemiglitoltroglitazonetolazamideinsulinglyburide-metforminglipizide-metforminmetformin-rosiglitazonemetformin-pioglitazonechangediabetesMedreadmittednumchangerace_Asianrace_Caucasianrace_Hispanicrace_Otherage_[10-20)age_[20-30)age_[30-40)age_[40-50)age_[50-60)age_[60-70)age_[70-80)age_[80-90)age_[90-100)medical_specialty_Emergency/Traumamedical_specialty_Family/GeneralPracticemedical_specialty_Gastroenterologymedical_specialty_InternalMedicinemedical_specialty_Nephrologymedical_specialty_ObstetricsandGynecologymedical_specialty_Orthopedicsmedical_specialty_Orthopedics-Reconstructivemedical_specialty_Othermedical_specialty_Psychiatrymedical_specialty_Pulmonologymedical_specialty_Radiologistmedical_specialty_Surgery-Cardiovascular/Thoracicmedical_specialty_Surgery-Generalmedical_specialty_Surgery-Neuromedical_specialty_Surgery-Vascularmedical_specialty_Unknowmedical_specialty_Urologydiag_1_Diabetesdiag_1_Digestivediag_1_Genitourinarydiag_1_Injurydiag_1_Muscoloskeletaldiag_1_Neoplasmsdiag_1_Othersdiag_1_Respiratorydiag_2_Diabetesdiag_2_Digestivediag_2_Genitourinarydiag_2_Injurydiag_2_Muscoloskeletaldiag_2_Neoplasmsdiag_2_Othersdiag_2_Respiratory
5320414714690160005-99110000001000000010000110201000001000000000000000000000100000010000000100
532053613271290109-99010000010000000010000110001000000001000000001000000000000000100000000000
53206361137766500016-99000000000000000010000110101000000001000000000000000000100000000000000000
53207311313150008-99-9910000001000000010000110000010001000000000000000000000100000001000000100
5320811713512130009-99-9910000000000000010000110100010001000000000000000000000101000000000001000
532091179502330009-99100000001000000010000110101000000001000000000000000000100100000001000000
5321011714736260109-99100000010000000010000110100010001000000000000000000000100010000000100000
532111172466171119-99-9900000000000000010000010000010000010000000000000000000100001000000100000
532121175761220109-99-9900000000000000010000110101000000000100000000000000000100000010000000100
53213117613330009-99-9900000000000000000000000001000000001000000000000000000100100000001000000